36 research outputs found

    Optimal Loop-Unrolling Mechanisms and Architectural Extensions for an Energy-Efficient Design of Shared Register Files in MPSoCs

    Get PDF
    In this paper we introduce a new hardware/software approach to reduce the energy of the shared register file in upcoming embedded architectures with several VLIW processors. This work includes a set of architectural extensions and special loop unrolling techniques for the compilers of MPSoC platforms. This complete hardware/software support enables reducing the energy consumed in the register file of MPSoC architectures up to a 60% without introducing performance penalties

    Dynamic Memory Management Design Methodology for Reduced Memory Footprint in Multimedia and Wireless Network Applications

    Get PDF
    New portable consumer embedded devices must execute multimedia and wireless network applications that demand extensive memory footprint. Moreover, they must heavily rely on Dynamic Memory (DM) due to the unpredictability of the input data (e.g. 3D streams features) and system behaviour (e.g. number of applications running concurrently defined by the user). Within this context, consistent design methodologies that can tackle efficiently the complex DM behaviour of these multimedia and network applications are in great need. In this paper, we present a new methodology that allows to design custom DM management mechanisms with a reduced memory footprint for such kind of dynamic applications. The experimental results in real case studies show that our methodology improves memory footprint 60% on average over current state-of-the-art DM managers

    Memory-Access-Aware Data Structure Transformations for Embedded Software with Dynamic Data Accesses

    Get PDF
    Embedded systems are evolving from traditional, stand-alone devices to devices that participate in Internet activity. The days of simple, manifest embedded software [e.g. a simple finite-impulse response (FIR) algorithm on a digital signal processor DSP)] are over. Complex, nonmanifest code, executed on a variety of embedded platforms in a distributed manner, characterizes next generation embedded software. One dominant niche, which we concentrate on, is embedded, multimedia software. The need is present to map large scale, dynamic, multimedia software onto an embedded system in a systematic and highly optimized manner. The objective of this paper is to introduce high-level, systematically applicable, data structure transformations and to show in detail the practical feasibility of our optimizations on three real-life multimedia case studies. We derive Pareto tradeoff points in terms of accesses versus memory footprint and obtain significant gains in execution time and power consumption with respect to the initial implementation choices. Our approach is a first step to systematically applying high-level data structure transformations in the context of memory-efficient and low-power multimedia systems

    Methodology for Refinement and Optimization of Dynamic Memory Management for Embedded Systems in Multimedia Applications

    Get PDF
    In multimedia applications, run-time memory management support has to allow real-time memory de/allocation, retrieving and processing of data. Thus, its implementation must be designed to combine high speed, low power, large data storage capacity and a high memory bandwidth. In this paper, we assess the performance of our new system-level exploration methodology to optimize the memory management of typical multimedia applications in an extensively used 3D reconstruction image system. This methodology is based on an analysis of the number of memory accesses, normalized memory use 1 and energy estimations for the system studied. This results in an improvement of normalized memory footprint up to 44.2% and the estimated energy dissipation up to 22.6% over conventional static memory implementations in an optimized version of the driver application. Finally, our final version is able to scale perfectly the memory consumed in the system for a wide range of input parameters whereas the statically optimized version is unable to do this

    Power Aware Tuning of Dynamic Memory Management for Embedded Real-Time Multimedia Applications

    Get PDF
    In the near future, portable embedded devices must run multimedia applications with enormous computational requirements at low energy consumption. These applications demand extensive memory footprint and must rely on dynamic memory due to the unpredictability of input data (e.g. 3D streams features) and system behaviour (e.g. variable number of applications running concurrently). Within this context, the dynamic memory subsystem is one of the main sources of power consumption and embedded systems have very limited batteries to provide efficient general-purpose dynamic memory management. As a result, consistent design methodologies that can efficiently tackle the complex dynamic memory behaviour of these new applications for low power embedded systems are in great need. In this paper we propose a step-wise system-level approach that allows the design of platform-specific dynamic memory management mechanisms with low power consumption for such kind of dynamic applications. The experimental results in reallife case studies show that our approach improves power consumption up to 89% over current state-of-the-art dynamic memory managers for complex applications

    Automated Exploration of Pareto-optimal Configurations in Parameterized Dynamic Memory Allocation for Embedded Systems

    Get PDF
    New applications in embedded systems are becoming Increasingly dynamic. In addition to increased dynamism, they have massive data storage needs. Therefore, they rely heavily on dynamic, run-time memory allocation. The design and configuration of a dynamic memory allocation subsystem requires a big design effort, without always achieving the desired results. In this paper, we propose a fully automated exploration of dynamic memory allocation configurations. These configurations are fine tuned to the specific needs of applications with the use of a number of parameters. We assess the effectiveness of the proposed approach in two representative real-life case studies of the multimedia and wireless network domains and show up to 76% decrease in memory accesses and 66% decrease in memory footprint within the Pareto-optimal trade-off space

    An integrated hardware/software approach for run-time scratch-management

    Get PDF
    An ever increasing number of dynamic interactive applications are implemented on portable consumer electronics. Designers depend largely on operating systems to map these applications on the architecture. However, today’s embedded operating systems abstract away the precise architectural details of the platform. As a consequence, they cannot exploit the energy efficiency of scratchpad memories. We present in this paper a novel integrated hardware/software solution to support scratchpad memories at a high abstraction level. We exploit hardware support to alleviate the transfer cost from/to the scratchpad memory and at the same time provide a high-level programming interface for run-time scratchpad management. We demonstrate the effectiveness of our approach with a case-study

    Reducing Memory Fragmentation with Performance-optimized Dynamic Memory Allocators in Network Applications

    Get PDF
    The needs for run-time data storage in modern wired and wireless network applications are increasing. Additionally, the nature of these applications is very dynamic, resulting in heavy reliance to dynamic memory allocation. The most significant problem in dynamic memory allocation is fragmentation, which can cause the system to run out of memory and crash, if it is left unchecked. The available dynamic memory allocation solutions are provided by the real time Operating Systems used in embedded or general-purpose systems. These state-of-the-art dynamic memory allocators are designed to satisfy the run-time memory requests of a wide range of applications. Contrary to most applications, network applications need to allocate too many different memory sizes (e.g. hundreds different sizes for packets) and have an extremely dynamic allocation and de-allocation behavior (e.g. unpredictable web-browsing activity). Therefore, the performance and the de-fragmentation efficiency of these allocators is limited. In this paper, we analyze all the important issues of fragmentation and the ways to reduce it in network applications, while keeping the performance of the dynamic memory allocator unaffected or even improving it. We propose highly customized dynamic memory allocators, which can be configured for specific network needs. We assess the effectiveness of the proposed approach in two representative real-life case studies of wired and wireless network applications. Finally, we show very significant reduction in memory fragmentation and increase in performance compared to state-of-the-art dynamic memory allocators utilized by real-time Operating System

    Some Experiences on Dynamic Memory Management Refinement at System-Level for Multimedia Applications

    Get PDF
    Nowadays, 3D multimedia applications have grown rapidly in number and consist of complex systems (e.g. 3D graphical processing or games) that process extensive amounts of data to create 3D images and results. This produces highcost and high-power consumption systems whereas a superior portability demands cheap and low-power consumption ones. In these multimedia applications, the dynamic memory subsystem is currently one of the main sources of power consumption and its inattentive management can affect severely the performance and power consumption of the whole system. In this paper, we illustrate a new system-level method to explore and refine the dynamic memory management of multimedia systems on current typical case studies, i.e. a relatively new 3D image reconstruction system and a 3D simulation game. This method is based on an analysis of the access pattern, amount of memory used and power consumption estimations. With this information, a phasewise exploration and refinement flow is used to optimize the system at the different phases of its hardware-oriented design process. As the results in the case studies show, our system-level method achieves great improvements in memory footprint, power consumption and performance for multimedia applications

    Systematic Design Flow for Dynamic Data Management in Visual Texture Decoder of MPEG-4

    Get PDF
    There is a clear trend of future embedded systems in moving toward wireless, multimedia, multi-functional and ubiquitous applications. This emerges new challenges in the existing solutions on performance, power, flexibility and costs, calling for innovations in both architecture and design methodology. In this paper we propose a design flow consisting of three stages to handle dynamic data, allowing the designer to create highly customized dynamic memory managers, make them bank-aware and create a design-time schedule of the different tasks of the application. We evaluated the proposed flow using the Visual Texture Coding (VTC) application, mapping it on a dual processor embedded platform achieving 5.5% reduction in memory footprint and 10% gains in execution time
    corecore